Assistant Professor | School of Information Science
Internal validity is the extent to which we can infer that the IV caused the change in the DV.
Internal validity depends on the strength or soundness of the design and influences whether one can conclude that the independent variable or intervention caused the dependent variable to change
What are the two characteristics used to evaluate internal validity?
Internal validity refers to the extent that the independent variable, treatment, or intervention caused the change in the dependent variable. We evaluate it based on…
What are the broad threats to internal validity?
Remember the sheet I gave you!
Measurement is often unreliable. Participants who score low on a measure may score higher (closer to the mean) the next time.
Participants drop out for a number of reasons, and this attrition changes the composition of the groups.
Problems arise when groups are formed based on existing similarities rather than by random assignment.
External validity is the extent to which samples, settings, and variables can be generalized beyond the study.
All the participants of theoretical interest to the researcher and to whom he or she would like to generalize.
Also called sampling frame
The group of participants you actually have access to, perhaps through a list or directory.
The smaller group of participants selected from the larger accessible population by the researcher and asked to participate in the study.
The participants that complete the study and whose data are actually used in the data analysis and in the report of the study’s results.
What are the characteristics of a sampling frame researchers should evaluate in determining its usefulness?
The sampling frame represents an exhaustive list of the participants that a researcher could realistically access for a study.
How do we ensure that a sample is representative of the target theoretical population?
Random selection is important for high external validity.
Random assignment is important for high internal validity.
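The difference between random selection and random assignment can be sketched in a few lines. This is only an illustration: the participant IDs, the frame size of 100, the sample size of 20, and the two equal groups are all assumptions made for the example.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Hypothetical sampling frame: 100 participant IDs (assumed for illustration).
sampling_frame = [f"P{i:03d}" for i in range(1, 101)]

# Random selection: every member of the frame has the same known chance
# (20 out of 100) of being selected. This supports external validity.
sample = rng.sample(sampling_frame, k=20)

# Random assignment: shuffle the selected participants, then split them
# into two groups. This supports internal validity.
shuffled = sample[:]
rng.shuffle(shuffled)
treatment, control = shuffled[:10], shuffled[10:]

print("Treatment:", treatment)
print("Control:  ", control)
```

Selection decides who gets into the study; assignment decides which condition each selected participant receives.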
Everyone has a known, nonzero chance of being chosen:
No method to estimate probability of being included:
What are the two types of external validity?
External validity is the extent to which samples, settings, and variables can be generalized beyond the study.
Whether the conditions, settings, times, testers, procedures, and so forth are representative of natural conditions and, thus, whether results can be generalized to real-life outcomes.
In other words: is the research environment similar to the natural environment? Does the manipulation of the IV feel real to the participants?
Power is the probability of detecting an effect, given that the effect is really there.
In other words, it is the probability of rejecting the null hypothesis when it is in fact false.
The probability of a Type I error (reject a true null).
For a test with a significance level of 0.05 (1/20), a true null hypothesis will be rejected, on average, one out of every 20 times.
We are willing to live with a 5% chance that we will conclude that there is a difference when there really isn’t (we are 95% confident).
The probability of a Type II error (fail to reject a false null)
The probability that we would fail to reject the null hypothesis even though the alternative hypothesis is actually true
Power = 1 − beta, so if power is .80 or 80%, then beta is .20 or 20%
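The relationship between alpha, beta, and power can be made concrete with a small calculation. The sketch below assumes a one-sided, one-sample z-test with a known standard deviation; the function name `z_test_power`, the standardized effect size of 0.5, and the sample size of 25 are illustrative assumptions, not values from the lecture.

```python
from statistics import NormalDist


def z_test_power(effect_size, n, alpha=0.05):
    """Power of a one-sided, one-sample z-test (known SD assumed).

    effect_size is the true mean difference in standard deviation units.
    """
    std_normal = NormalDist()  # mean 0, SD 1
    # Critical value: we reject the null when the z statistic exceeds this.
    z_crit = std_normal.inv_cdf(1 - alpha)
    # If the effect is real, the z statistic is centered here instead of at 0.
    shift = effect_size * n ** 0.5
    # Power is the chance that the statistic lands beyond the critical value.
    return 1 - std_normal.cdf(z_crit - shift)


power = z_test_power(effect_size=0.5, n=25)
beta = 1 - power  # probability of a Type II error
print(f"power = {power:.2f}, beta = {beta:.2f}")
```

Under these assumptions power comes out at roughly .80, which leaves beta at roughly .20. Raising `n` or `effect_size` raises power; lowering `alpha` lowers it.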
What is the difference between a conceptual definition and an operational definition?
These are not interchangeable
When we discussed the IV, we discussed designing experiments that allow us to control the variable of interest.
In other cases, the variables we are interested in might be continuous.
This is most often the dependent variable (i.e., what is thought to be changed by the IV)
Four levels of measurement are used to describe the range and the relationship among the values a variable can take.
Depending on the level, the data can mean different things.
For example, the number 2 might indicate a score of two; it might indicate that the participant was male; or it might indicate that the participant was ranked second in the class.
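To see why the level matters, here is a minimal sketch that picks a measure of central tendency according to the level of measurement. The level names (nominal, ordinal, interval, ratio) are the standard four; the helper `central_tendency` and the sample values are hypothetical, and the mapping shown is a common rule of thumb rather than a strict requirement.

```python
from statistics import mean, median, mode


def central_tendency(values, level):
    """Return a summary statistic that is meaningful for the given level."""
    if level == "nominal":
        # Numbers are only labels (e.g., 2 = male), so only the mode makes sense.
        return mode(values)
    if level == "ordinal":
        # Numbers are ranks (e.g., 2 = second in class), so the median is safe.
        return median(values)
    if level in ("interval", "ratio"):
        # Numbers are quantities (e.g., a score of 2), so the mean is meaningful.
        return mean(values)
    raise ValueError(f"unknown level of measurement: {level}")


codes = [1, 2, 2, 3, 5]  # the same digits, interpreted three different ways
print(central_tendency(codes, "nominal"))   # most frequent label
print(central_tendency(codes, "ordinal"))   # middle rank
print(central_tendency(codes, "interval"))  # arithmetic average
```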
The normal curve provides a model for the fit of the distributions of many of the dependent variables used in the behavioral sciences.
X axis: scores or responses on an ordered variable from very low to very high
Y axis: number of participants with a particular response
Think of it as a probability distribution
What is the probability of a participant’s typical response?
If externally valid, should align with probability distribution of the theoretical population.
5 properties that are ALWAYS present:
Ordered from low to high, responses are at least approximately normally distributed in the population from which the sample was selected
A number of statistical tests rely on the assumption that the distribution is normal
When data are normally distributed, the mean, median, and mode are all the same and in the center of the distribution.
If data are normal, mean is the stat to use.
Variability describes the spread or dispersion of the scores.
If all scores are the same, there is no variation.
In your research, you test to see if an IV can explain the variation in the DV.
Is the IV the reason that the values of the DV varied from person to person?
Standard deviation is the most common measure of variability. It is a measure of how scores vary about the mean.
Gives you a sense of the spread of the data
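As a quick illustration, the standard deviation can be computed by hand and checked against Python's `statistics` module. The three scores below are made up for the example.

```python
from statistics import mean, pstdev, stdev

scores = [55, 70, 85]  # hypothetical test scores

m = mean(scores)                               # 70
squared_devs = [(x - m) ** 2 for x in scores]  # 225, 0, 225

# Sample SD (s): divide the sum of squared deviations by n - 1.
s_by_hand = (sum(squared_devs) / (len(scores) - 1)) ** 0.5

# Population SD (sigma): divide by n instead.
sigma_by_hand = (sum(squared_devs) / len(scores)) ** 0.5

print(s_by_hand, stdev(scores))       # sample SD, two ways
print(sigma_by_hand, pstdev(scores))  # population SD, two ways
```

Here the sample SD is 15 and the population SD is about 12.25. The sample version divides by n - 1, so it is always a little larger than the population version for the same data.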
Remember the normal curve?
If \(\overline{x}\) = 70 and s = 15.22
For population: \(\mu\) = 70 and \(\sigma\) = 15.22
Can convert a normal curve into a standard normal curve by setting the mean equal to zero and the SD equal to 1
Calculated by subtracting the mean from each data point, and then dividing the difference by the standard deviation of the population
Proportions are always the same. This allows comparisons for curves with different means
A standard score that indicates the number of standard deviation units that a person’s score deviates from the group mean
Let’s say \(\overline{x}\) = 70 and s = 15.22 represent values for student test scores.
Allows us to flag outliers (e.g., if |z| > 3.29, i.e., more than about 3 standard deviations from the mean)
Allows you to compare scores on different tests.
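The z-score uses described above can be sketched with the test-score values from this example (mean 70, s = 15.22). The raw scores of 100, 125, and 85, and the second test with mean 500 and SD 100, are hypothetical values added for illustration.

```python
from statistics import NormalDist


def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd


# A student who scored 100 on the test (mean 70, s = 15.22).
z = z_score(100, 70, 15.22)
print(f"z = {z:.2f}")

# Flag outliers using the |z| > 3.29 cutoff.
is_outlier = abs(z_score(125, 70, 15.22)) > 3.29
print("125 is an outlier:", is_outlier)

# Compare scores on two different tests by putting both on the z scale.
z_test_a = z_score(85, 70, 15.22)  # 85 on this test
z_test_b = z_score(580, 500, 100)  # 580 on a hypothetical second test
print("Relatively better on test A:", z_test_a > z_test_b)

# Proportions under the standard normal curve are always the same,
# e.g., the share of scores within 1 SD of the mean.
std_normal = NormalDist()
within_one_sd = std_normal.cdf(1) - std_normal.cdf(-1)
print(f"Within 1 SD: {within_one_sd:.3f}")
```

A score of 100 is a little under 2 standard deviations above the mean, so it is high but not an outlier; a score of 125 exceeds the 3.29 cutoff.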
What techniques and instruments do we use to collect data?
Major Types of Data Collection Techniques:
Direct Observation
A set of problems with right or wrong answers
Survey research methods!
WHY is measurement theory important in quantitative statistical analysis?
BOTTOM LINE: Levels of measurement influence the appropriate use of statistics!
In general, it is advisable to select instruments that have been used in other studies if they have been shown to produce reliable and valid data with the types of participants and for the purpose that you have in mind.
If those aren’t available…
Scale development is useful for capturing concepts that are not directly observable
For your projects…
General Guidelines
Fundamental goal at this stage is to sample systematically all content that is potentially relevant to the target construct.
You can drop weak items later, but you cannot add items that were never included.
Recommend Likert scale but you can use semantic differential if you really want.